Synthetical Enlargement of MFCC Based Training Sets for Emotion Recognition

Authors

David C. Wyld

Inma Mohino-Herranz

Roberto Gil-Pita

Sagrario Alonso-Diaz

Manuel Rosa-Zurera

Abstract

Recognizing emotional state from speech is currently a very active research topic. Using subliminal information in speech, it is possible to recognize the emotional state of the speaker. One of the main problems in the design of automatic emotion recognition systems is the small number of available patterns, which makes the learning process more difficult because of the generalization problems that arise under these conditions. In this work we propose a solution to this problem that consists of enlarging the training set by creating new virtual patterns. In emotional speech, most of the emotional information is carried by speed and pitch variations, so a change in the average pitch that modifies neither the speed nor the pitch variations does not affect the expressed emotion. We use this prior information to create new patterns by applying a pitch shift modification in the feature extraction process of the classification system. For this purpose, we propose a frequency scaling modification of the Mel Frequency Cepstral Coefficients (MFCCs) used to classify the emotion. This process allows us to synthetically increase the number of available patterns in the training set, thus increasing the generalization capability of the system and reducing the test error.
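The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' exact formulation, which this page does not give. It assumes that the frequency scaling is applied by multiplying the edge frequencies of the mel filterbank by a factor `alpha`, and that each utterance is summarized by the mean of its MFCC frames. The function names, the `alpha` values, and the frame settings are all hypothetical choices made for illustration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate, alpha=1.0):
    """Triangular mel filters whose edge frequencies are scaled by alpha (assumed scheme)."""
    nyquist = sample_rate / 2.0
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(nyquist), n_filters + 2)
    # Scale the frequency axis, then keep the edges inside [0, Nyquist].
    hz_edges = np.clip(alpha * mel_to_hz(mel_edges), 0.0, nyquist)
    bin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, bin_freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (bin_freqs - lo) / max(mid - lo, 1e-9)
        falling = (hi - bin_freqs) / max(hi - mid, 1e-9)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def dct_matrix(n_coeffs, n_filters):
    """Unnormalized DCT-II basis used to decorrelate the log filterbank energies."""
    k = np.arange(n_coeffs)[:, None]
    n = np.arange(n_filters)[None, :]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))

def mfcc(signal, sample_rate, alpha=1.0, n_fft=512, hop=256,
         n_filters=26, n_coeffs=13):
    """Frame-level MFCCs; the signal must contain at least n_fft samples."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate, alpha).T
    return np.log(energies + 1e-10) @ dct_matrix(n_coeffs, n_filters).T

def enlarge_training_set(signals, labels, sample_rate, alphas=(0.9, 1.0, 1.1)):
    """One feature vector per (utterance, alpha) pair; the label is unchanged."""
    feats, out_labels = [], []
    for sig, lab in zip(signals, labels):
        for a in alphas:
            feats.append(mfcc(sig, sample_rate, alpha=a).mean(axis=0))
            out_labels.append(lab)
    return np.array(feats), out_labels
```

With three values of `alpha`, each original utterance contributes three training patterns that share the same emotion label, which is the sense in which the training set is enlarged. Whether a given range of `alpha` preserves the expressed emotion would have to be checked empirically.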

Similar resources

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

Speech Emotion Recognition Using Residual Phase and MFCC Features

The main objective of this research is to develop a speech emotion recognition system using residual phase and MFCC features with autoassociative neural network (AANN). The speech emotion recognition system classifies the speech emotion into predefined categories such as anger, fear, happy, neutral or sad. The proposed technique for speech emotion recognition (SER) has two phases: Fe...

Emotion Recognition using Dynamic Time Warping Technique for Isolated Words

Emotion recognition helps to recognize the internal expressions of individuals from a speech database. In this paper, the dynamic time warping (DTW) technique is used for speaker-independent emotion recognition based on 39 MFCC features. A large audio set of around 960 samples of isolated words in five different emotions was collected and recorded at 20 to 300 KHz sampling frequency....

MFCC based Enlargement of the Training Set for Emotion Recognition in Speech

Recognizing emotional state from speech is currently a very active research topic. Using subliminal information in speech, known as “prosody”, it is possible to recognize the emotional state of the speaker. One of the main problems in the design of automatic emotion recognition systems is the small number of available patterns. This fact makes the learning process more difficu...

Inferring the Human Emotional State of Mind using Assymetric Distrubution

This paper presents a methodology for emotion recognition based on a Skew Symmetric Gaussian Mixture Model classifier, with MFCC-SDC cepstral coefficients as the features, for recognizing various emotions in a generated data set of emotional voices belonging to students of both genders at GITAM University. For training and testing of the developed methodology, the data collection...

Publication date: 2013
